Decoding Anagrammed Texts Written in an Unknown Language and Script

Authors

  • Bradley Hauer
  • Grzegorz Kondrak
Abstract

Algorithmic decipherment is a prime example of a truly unsupervised problem. The first step in the decipherment process is the identification of the encrypted language. We propose three methods for determining the source language of a document enciphered with a monoalphabetic substitution cipher. The best method achieves 97% accuracy on 380 languages. We then present an approach to decoding anagrammed substitution ciphers, in which the letters within words have been arbitrarily transposed. It achieves an average decryption word accuracy of 93% on a set of 50 ciphertexts in 5 languages. Finally, we report results on the Voynich manuscript, an unsolved fifteenth-century cipher, which suggest Hebrew as the language of the document.
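To make the cipher model in the abstract concrete, here is a minimal Python sketch of an anagrammed monoalphabetic substitution cipher. It is an illustration only, not the authors' method: the helper names (`encipher`, `anagram_words`, `frequency_profile`), the random key, and the sample sentence are all assumptions made for this example. It shows one property that survives both operations, namely the sorted list of letter counts, which is the kind of symbol-independent signal a language identification method could plausibly draw on.

```python
import random
from collections import Counter

# Hypothetical helpers for illustration; none of these names come from the paper.

def encipher(text, key):
    """Apply a monoalphabetic substitution: each letter maps to one cipher letter."""
    return "".join(key.get(ch, ch) for ch in text)

def anagram_words(text, rng):
    """Arbitrarily transpose the letters within each word, keeping word boundaries."""
    shuffled = []
    for word in text.split():
        letters = list(word)
        rng.shuffle(letters)
        shuffled.append("".join(letters))
    return " ".join(shuffled)

def frequency_profile(text):
    """Letter counts in descending order, ignoring which symbol carries which count."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return sorted(counts.values(), reverse=True)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
alphabet = "abcdefghijklmnopqrstuvwxyz"
shuffled_alphabet = list(alphabet)
rng.shuffle(shuffled_alphabet)
key = dict(zip(alphabet, shuffled_alphabet))  # a random one-to-one letter mapping

plaintext = "the first step in the decipherment process is the identification of the language"
substituted = encipher(plaintext, key)
ciphertext = anagram_words(substituted, rng)

# Both substitution and within-word transposition leave the profile unchanged.
same_profile = frequency_profile(plaintext) == frequency_profile(ciphertext)
print(same_profile)
```

Because the key is one-to-one and the transposition only reorders letters inside a word, word lengths and the multiset of letters in each word are preserved as well; only letter identity and within-word order are hidden.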


Similar resources

The Effectiveness of Shadow-Reading With and Without Written Script on Listening Comprehension of Iranian Intermediate EFL Students.

Listening comprehension is at the heart of language learning (Kurita, 2012). It is an important language skill to develop in terms of second language acquisition (SLA) (Dunkel, 1991; Rost, 2001; Vandergrift, 2007). In spite of its importance, L2 learners often regard listening as the most difficult language skill to learn. In this study, shadowing as an act or task in listening, in which the learner...


The Effectiveness of Shadow-Reading With and Without Written Script on Pronunciation of Iranian Intermediate EFL Students

Pronunciation is essential to appropriate communication because the incorrect use of pronunciation inevitably leads to the message being misunderstood by the receptor. In spite of its importance, L2 learners often regard pronunciation as the most difficult language skill to learn. In this study, shadowing as an act or task in pronunciation, in which the learner tracks the target speech and repeats ...


Script and Text in Time and Space

The project Script and Text in Time and Space will create a new foundation for working with medieval Danish texts by creating and deepening fundamental knowledge about the development of medieval script in Denmark as well as the dating and localization of relevant text-bearing objects (manuscripts and other document types). The project is placed in the field of Digital Humanities, meaning that ...


Unsupervised Analysis of the Voynich Manuscript

The aim of this project is to research the possibilities of applying unsupervised learning techniques for natural language and other sequential data to undeciphered texts and manuscripts. The undeciphered text used is the Voynich Manuscript, a mysterious book from the 15th or 16th century that is written in an unknown script. Some methods that could be applied to manuscripts such as these will ...


Metadiscourse Markers Revisited in EFL Context: The Case of Iranian Academic Learners’ Perception of Written Texts

Moving in line with the postulation that metadiscourse (MD) markers help transform a dry and tortuous piece of text into a coherent and reader-friendly one, the researchers in the current study attempted to investigate the effect different metadiscourse markers might have on Iranian EFL learners’ perception of written texts. To this end, 120 undergraduate English students were given three diffe...



Journal:
  • TACL

Volume 4  Issue 

Pages  -

Publication date 2016